P2P Domain Classification using Decision Tree
نویسندگان
چکیده
The increasing interest in Peer-to-Peer systems (such as Gnutella) has inspired many research activities in this area. Although many demonstrations have been performed that show that the performance of a Peer-to-Peer system is highly dependent on the underlying network characteristics, much of the evaluation of Peer-to-Peer proposals has used simplified models that fail to include a detailed model of the underlying network. This can be largely attributed to the complexity in experimenting with a scalable Peer-to-Peer system simulator built on top of a scalable network simulator. A major problem of unstructured P2P systems is their heavy network traffic. In Peer-to-Peer context, a challenging problem is how to find the appropriate peer to deal with a given query without overly consuming bandwidth? Different methods proposed routing strategies of queries taking into account the P2P network at hand. This paper considers an unstructured P2P system based on an organization of peers around Super-Peers that are connected to Super-SuperPeer according to their semantic domains; in addition to integrating Decision Trees in P2P architectures to produce Query-Suitable Super-Peers, representing a community of peers where one among them is able to answer the given query. By analyzing the queries log file, a predictive model that avoids flooding queries in the P2P network is constructed after predicting the appropriate Super-Peer, and hence the peer to answer the query. A challenging problem in a schema-based Peer-to-Peer (P2P) system is how to locate peers that are relevant to a given query. In this paper, architecture, based on (Super-)Peers is proposed, focusing on query routing. The approach to be implemented, groups together (Super-)Peers that have similar interests for an efficient query routing method. In such groups, called Super-Super-Peers (SSP), Super-Peers submit queries that are often processed by members of this group. A SSP is a specific Super-Peer which contains knowledge about: 1. its Super-Peers and 2. The other SSP. Knowledge is extracted by using data mining techniques (e.g. Decision Tree algorithms) starting from queries of peers that transit on the network. The advantage of this distributed knowledge is that, it avoids making semantic mapping between heterogeneous data sources owned by (Super-)Peers, each time the system decides to route query to other (Super-) Peers. The set of SSP improves the robustness in queries routing mechanism, and the scalability in P2P Network. Compared with a baseline approach,the proposal architecture shows the effect of the data mining with better performance in respect to response time and precision.
منابع مشابه
Comparison of Decision Tree and Naïve Bayes Methods in Classification of Researcher’s Cognitive Styles in Academic Environment
In today world of internet, it is important to feedback the users based on what they demand. Moreover, one of the important tasks in data mining is classification. Today, there are several classification techniques in order to solve the classification problems like Genetic Algorithm, Decision Tree, Bayesian and others. In this article, it is attempted to classify researchers to “Expert” and “No...
متن کاملComparison of Decision Tree and Naïve Bayes Methods in Classification of Researcher’s Cognitive Styles in Academic Environment
In today world of internet, it is important to feedback the users based on what they demand. Moreover, one of the important tasks in data mining is classification. Today, there are several classification techniques in order to solve the classification problems like Genetic Algorithm, Decision Tree, Bayesian and others. In this article, it is attempted to classify researchers to “Expert” and “No...
متن کاملPeer-to-Peer IP Traffic Classification Using Decision Tree and IP Layer Attributes
We present a new approach using data-mining technique and, in particular, decision tree to classify peer-to-peer (P2P) traffic in IP networks. We captured the Internet traffic at a main gateway router, performed preprocessing on the data, selected the most significant attributes, and prepared a training-data set to which the decision-tree algorithm was applied. We built several models using a c...
متن کاملClassification of Peer-to-Peer Traffic Using A Two-Stage Window-Based Classifier With Fast Decision Tree and IP Layer Attributes
This paper presents a new approach using data mining techniques, and in particular a two-stage architecture, for classification of Peer-to-Peer (P2P) traffic in IP networks where in the first stage the traffic is filtered using standard port numbers and layer 4 port matching to label well-known P2P and NonP2P traffic. The labeled traffic produced in the first stage is used to train a Fast Decis...
متن کاملSteel Buildings Damage Classification by damage spectrum and Decision Tree Algorithm
Results of damage prediction in buildings can be used as a useful tool for managing and decreasing seismic risk of earthquakes. In this study, damage spectrum and C4.5 decision tree algorithm were utilized for damage prediction in steel buildings during earthquakes. In order to prepare the damage spectrum, steel buildings were modeled as a single-degree-of-freedom (SDOF) system and time-history...
متن کاملAdoption of a Fuzzy Based Classification Model for P2P Botnet Detection
Botnet threat has increased enormously with adoption of newer technologies like root kit, anti-antivirus modules etc. by the hackers. Emergence of botnets having distributed C & C structure that mimic P2P technologically, has made its detection and dismantling extremely difficult. However, numeric flow feature values of P2P botnet C & C traffic can be used to generate fuzzy rule-set which can t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1109.1147 شماره
صفحات -
تاریخ انتشار 2011